Stochastic mixed model sequencing with multiple stations using reinforcement learning and probability quantiles
Authors
Abstract
In this study, we propose a reinforcement learning (RL) approach for minimizing the number of work overload situations in the mixed model sequencing (MMS) problem with stochastic processing times. The learning environment simulates stochastic processing times and penalizes work overloads with negative rewards. To account for the stochastic component of the problem, we implement a state representation that specifies whether work overloads will occur if the processing times are equal to their respective 25%, 50%, and 75% probability quantiles. Thereby, the RL agent is guided toward minimizing work overloads while being provided with statistical information about how processing time fluctuations affect solution quality. To the best of our knowledge, this study is the first to consider the stochastic problem variation with the minimization of work overload situations.
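The quantile-based state representation can be sketched with a small example. The following Python snippet is a hypothetical illustration, not the authors' implementation: it assumes a single station with a fixed border, a conveyor that advances by one cycle time per launched model, and given 25%/50%/75% processing-time quantiles for the candidate model. All names (`overload_indicators`, `station_length`, and so on) are illustrative.

```python
import numpy as np


def overload_indicators(current_position, quantile_times, cycle_time, station_length):
    """Return one binary feature per processing-time quantile of the candidate model.

    A feature is 1.0 if scheduling the model would push the worker past the
    station border (a work overload) under that quantile's processing time,
    and 0.0 otherwise. (Simplified single-station model for illustration.)
    """
    indicators = []
    for t in quantile_times:
        # The worker would finish at current_position + t; after the conveyor
        # advances by one cycle_time, an overload occurs if the finishing
        # position still lies beyond the station border.
        would_overload = current_position + t - cycle_time > station_length
        indicators.append(1.0 if would_overload else 0.0)
    return np.array(indicators, dtype=np.float32)


# Example with the 25%, 50%, and 75% quantiles of the candidate model's
# processing time: only the 75% quantile would cause a work overload.
features = overload_indicators(
    current_position=2.0,
    quantile_times=[3.5, 4.0, 5.2],
    cycle_time=4.0,
    station_length=3.0,
)
print(features)  # [0. 0. 1.]
```

In this sketch, the resulting feature vector [0, 0, 1] tells the agent that an overload is only expected if the processing time reaches its 75% quantile, which is the kind of statistical information the state representation provides.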
Similar resources
A Multi-Objective Mixed-Model Assembly Line Sequencing Problem With Stochastic Operation Time
In today’s competitive market, producers who can quickly adapt to the diverse demands of customers are successful. Therefore, in order to satisfy these market demands, the mixed-model assembly line (MMAL) has seen increasing adoption in industry. An MMAL is a type of production line in which varieties of products with common base characteristics are assembled o...
Multiple Model-Based Reinforcement Learning
We propose a modular reinforcement learning architecture for nonlinear, nonstationary control tasks, which we call multiple model-based reinforcement learning (MMRL). The basic idea is to decompose a complex task into multiple domains in space and time based on the predictability of the environmental dynamics. The system is composed of multiple modules, each of which consists of a state predict...
Reinforcement Learning using Kernel-Based Stochastic Factorization
Kernel-based reinforcement-learning (KBRL) is a method for learning a decision policy from a set of sample transitions which stands out for its strong theoretical guarantees. However, the size of the approximator grows with the number of transitions, which makes the approach impractical for large problems. In this paper we introduce a novel algorithm to improve the scalability of KBRL. We resor...
Reinforcement Learning with Multiple Demonstrations
Many tasks in robotics can be described as a trajectory that the robot should follow. Unfortunately, specifying the desired trajectory is often a non-trivial task. For example, when asked to describe the trajectory that a helicopter should follow to perform an aerobatic flip, one would have to not only (a) specify a complete trajectory in state space that intuitively corresponds to the aerobati...
Reinforcement Learning by Probability Matching
We present a new algorithm for associative reinforcement learning. The algorithm is based upon the idea of matching a network's output probability with a probability distribution derived from the environment's reward signal. This Probability Matching algorithm is shown to perform faster and be less susceptible to local minima than previously existing algorithms. We use Probability Matching to t...
Journal
Journal title: OR Spectrum
Year: 2021
ISSN: 0171-6468, 1436-6304
DOI: https://doi.org/10.1007/s00291-021-00652-x